Search | VHL Regional Portal (BVS)

Annotation of biologically relevant ligands in UniProtKB using ChEBI.

Coudert, Elisabeth; Gehant, Sebastien; de Castro, Edouard; Pozzato, Monica; Baratin, Delphine; Neto, Teresa; Sigrist, Christian J A; Redaschi, Nicole; Bridge, Alan.

Bioinformatics ; 39(1), 2023 01 01.

Article in English | MEDLINE | ID: mdl-36484697

ABSTRACT

MOTIVATION: To provide high quality, computationally tractable annotation of binding sites for biologically relevant (cognate) ligands in UniProtKB using the chemical ontology ChEBI (Chemical Entities of Biological Interest), to better support efforts to study and predict functionally relevant interactions between protein sequences and structures and small molecule ligands. RESULTS: We structured the data model for cognate ligand binding site annotations in UniProtKB and performed a complete reannotation of all cognate ligand binding sites using stable unique identifiers from ChEBI, which we now use as the reference vocabulary for all such annotations. We developed improved search and query facilities for cognate ligands in the UniProt website, REST API and SPARQL endpoint that leverage the chemical structure data, nomenclature and classification that ChEBI provides. AVAILABILITY AND IMPLEMENTATION: Binding site annotations for cognate ligands described using ChEBI are available for UniProtKB protein sequence records in several formats (text, XML and RDF) and are freely available to query and download through the UniProt website (www.uniprot.org), REST API (www.uniprot.org/help/api), SPARQL endpoint (sparql.uniprot.org/) and FTP site (https://ftp.uniprot.org/pub/databases/uniprot/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Knowledge Bases; Databases, Protein; Ligands; Amino Acid Sequence; Binding Sites; Molecular Sequence Annotation
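
As a rough illustration of the programmatic access this abstract mentions, the sketch below builds a UniProt REST search URL for entries that have a binding site annotated with a given ChEBI ligand. The host `rest.uniprot.org`, the `ft_binding` query field, and the returned column names are assumptions that the abstract does not state and should be checked against the API documentation at www.uniprot.org/help/api. The helper name `build_ligand_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the UniProtKB search service (not given in the abstract).
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def build_ligand_search_url(chebi_id: str, size: int = 5) -> str:
    """Return a search URL for entries with a binding site for `chebi_id`.

    `ft_binding` is assumed to be the query field for binding site features.
    """
    params = {
        "query": f'ft_binding:"{chebi_id}"',
        "fields": "accession,protein_name,ft_binding",  # assumed column names
        "format": "tsv",
        "size": size,
    }
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # CHEBI:30616 is the ChEBI identifier for ATP(4-).
    print(build_ligand_search_url("CHEBI:30616"))
```

The function only assembles the URL and sends nothing, so it can be inspected offline before being passed to an HTTP client.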

Diverse Taxonomies for Diverse Chemistries: Enhanced Representation of Natural Product Metabolism in UniProtKB.

Feuermann, Marc; Boutet, Emmanuel; Morgat, Anne; Axelsen, Kristian B; Bansal, Parit; Bolleman, Jerven; de Castro, Edouard; Coudert, Elisabeth; Gasteiger, Elisabeth; Géhant, Sébastien; Lieberherr, Damien; Lombardot, Thierry; Neto, Teresa B; Pedruzzi, Ivo; Poux, Sylvain; Pozzato, Monica; Redaschi, Nicole; Bridge, Alan.

Metabolites ; 11(1), 2021 Jan 12.

Article in English | MEDLINE | ID: mdl-33445429

ABSTRACT

The UniProt Knowledgebase UniProtKB is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional annotation that covers genomes and proteomes from tens of thousands of taxa, including a broad range of plants and microorganisms producing natural products of medical, nutritional, and agronomical interest. Here we describe work that enhances the utility of UniProtKB as a support for both the study of natural products and for their discovery. The foundation of this work is an improved representation of natural product metabolism in UniProtKB using Rhea, an expert-curated knowledgebase of biochemical reactions, that is built on the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Knowledge of natural products and precursors is captured in ChEBI, enzyme-catalyzed reactions in Rhea, and enzymes in UniProtKB/Swiss-Prot, thereby linking chemical structure data directly to protein knowledge. We provide a practical demonstration of how users can search UniProtKB for protein knowledge relevant to natural products through interactive or programmatic queries using metabolite names and synonyms, chemical identifiers, chemical classes, and chemical structures and show how to federate UniProtKB with other data and knowledge resources and tools using semantic web technologies such as RDF and SPARQL. All UniProtKB data are freely available for download in a broad range of formats for users to further mine or exploit as an annotation source, to enrich other natural product datasets and databases.
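
The programmatic SPARQL queries described above might look something like the following sketch. It pairs an example query (proteins annotated with Rhea reactions that involve a given ChEBI compound) with a small helper that flattens a result set in the W3C SPARQL 1.1 JSON results format. The query vocabulary (`up:catalyzedReaction`, `rh:side`, `rh:contains`, `rh:compound`, `rh:chebi`) and the Rhea service URL are assumptions modeled on public UniProt and Rhea SPARQL examples, not taken from the abstract. The sample response is made up and its IRIs are placeholders.

```python
import json
from typing import Dict, List

# Assumed query shape; CHEBI:27732 is the ChEBI identifier for caffeine.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rh: <http://rdf.rhea-db.org/>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>
SELECT ?protein ?rhea WHERE {
  SERVICE <https://sparql.rhea-db.org/sparql> {
    ?rhea rdfs:subClassOf rh:Reaction ;
          rh:side/rh:contains/rh:compound/rh:chebi CHEBI:27732 .
  }
  ?protein up:annotation/up:catalyticActivity/up:catalyzedReaction ?rhea .
} LIMIT 10
"""


def flatten_bindings(results: dict) -> List[Dict[str, str]]:
    """Turn SPARQL 1.1 JSON results into a list of {variable: value} rows."""
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        # Unbound variables are absent from a binding and are skipped here.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


# Made-up response body with placeholder IRIs, used only to exercise the helper.
SAMPLE_RESPONSE = json.loads("""
{
  "head": {"vars": ["protein", "rhea"]},
  "results": {"bindings": [
    {"protein": {"type": "uri", "value": "http://example.org/protein/EXAMPLE1"},
     "rhea": {"type": "uri", "value": "http://example.org/reaction/EXAMPLE_A"}},
    {"protein": {"type": "uri", "value": "http://example.org/protein/EXAMPLE2"}}
  ]}
}
""")

if __name__ == "__main__":
    for row in flatten_bindings(SAMPLE_RESPONSE):
        print(row)
```

Keeping the parsing separate from the query text means the same helper works for any SELECT query whose results are returned as JSON.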

HAMAP as SPARQL rules - A portable annotation pipeline for genomes and proteomes.

Bolleman, Jerven; de Castro, Edouard; Baratin, Delphine; Gehant, Sebastien; Cuche, Beatrice A; Auchincloss, Andrea H; Coudert, Elisabeth; Hulo, Chantal; Masson, Patrick; Pedruzzi, Ivo; Rivoire, Catherine; Xenarios, Ioannis; Redaschi, Nicole; Bridge, Alan.

Gigascience ; 9(2), 2020 02 01.

Article in English | MEDLINE | ID: mdl-32034905

ABSTRACT

BACKGROUND: Genome and proteome annotation pipelines are generally custom built and not easily reusable by other groups. This leads to duplication of effort, increased costs, and suboptimal annotation quality. One way to address these issues is to encourage the adoption of annotation standards and technological solutions that enable the sharing of biological knowledge and tools for genome and proteome annotation. RESULTS: Here we demonstrate one approach to generate portable genome and proteome annotation pipelines that users can run without recourse to custom software. This proof of concept uses our own rule-based annotation pipeline HAMAP, which provides functional annotation for protein sequences to the same depth and quality as UniProtKB/Swiss-Prot, and the World Wide Web Consortium (W3C) standards Resource Description Framework (RDF) and SPARQL (a recursive acronym for the SPARQL Protocol and RDF Query Language). We translate complex HAMAP rules into the W3C standard SPARQL 1.1 syntax, and then apply them to protein sequences in RDF format using freely available SPARQL engines. This approach supports the generation of annotation that is identical to that generated by our own in-house pipeline, using standard, off-the-shelf solutions, and is applicable to any genome or proteome annotation pipeline. CONCLUSIONS: HAMAP SPARQL rules are freely available for download from the HAMAP FTP site, ftp://ftp.expasy.org/databases/hamap/sparql/, under the CC-BY-ND 4.0 license. The annotations generated by the rules are under the CC-BY 4.0 license. A tutorial and supplementary code to use HAMAP as SPARQL are available on GitHub at https://github.com/sib-swiss/HAMAP-SPARQL, and general documentation about HAMAP can be found on the HAMAP website at https://hamap.expasy.org.

Subjects

Genomics/methods; Molecular Sequence Annotation/methods; Sequence Analysis, DNA/methods; Sequence Analysis, Protein/methods; Software/standards; Animals; Genomics/standards; Humans; Molecular Sequence Annotation/standards; Sequence Analysis, DNA/standards; Sequence Analysis, Protein/standards
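
The idea of expressing annotation rules as queries over RDF can be pictured with a toy example. The sketch below is not a HAMAP rule and does not use a SPARQL engine. It applies one condition-then-annotate rule to an in-memory set of triples in plain Python, which roughly mirrors what a SPARQL CONSTRUCT query would produce. All prefixes, predicates, and identifiers (`ex:matchesSignature`, `ex:inTaxon`, `ex:recommendedName`, `ex:SIG_0001`, and the protein names) are invented for illustration.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical predicate names, invented for this illustration only.
MATCHES = "ex:matchesSignature"
IN_TAXON = "ex:inTaxon"
ANNOTATED = "ex:recommendedName"


def apply_rule(graph: Set[Triple], signature: str, taxon: str, name: str) -> Set[Triple]:
    """Return new annotation triples for proteins that satisfy both conditions.

    Roughly equivalent in spirit to:
      CONSTRUCT { ?p ex:recommendedName "name" }
      WHERE     { ?p ex:matchesSignature <signature> ; ex:inTaxon <taxon> }
    """
    has_signature = {s for (s, p, o) in graph if p == MATCHES and o == signature}
    in_taxon = {s for (s, p, o) in graph if p == IN_TAXON and o == taxon}
    # Only proteins meeting both conditions receive the annotation.
    return {(protein, ANNOTATED, name) for protein in has_signature & in_taxon}


if __name__ == "__main__":
    graph = {
        ("ex:P1", MATCHES, "ex:SIG_0001"),
        ("ex:P1", IN_TAXON, "ex:Bacteria"),
        ("ex:P2", MATCHES, "ex:SIG_0001"),
        ("ex:P2", IN_TAXON, "ex:Archaea"),
    }
    print(apply_rule(graph, "ex:SIG_0001", "ex:Bacteria", "Example synthase"))
```

In a real setting the conditions and the generated annotation would live in the SPARQL query itself, so any standards-compliant engine could run the rule without custom code.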

Enzyme annotation in UniProtKB using Rhea.

Morgat, Anne; Lombardot, Thierry; Coudert, Elisabeth; Axelsen, Kristian; Neto, Teresa Batista; Gehant, Sebastien; Bansal, Parit; Bolleman, Jerven; Gasteiger, Elisabeth; de Castro, Edouard; Baratin, Delphine; Pozzato, Monica; Xenarios, Ioannis; Poux, Sylvain; Redaschi, Nicole; Bridge, Alan.

Bioinformatics ; 36(6): 1896-1901, 2020 03 01.

Article in English | MEDLINE | ID: mdl-31688925

ABSTRACT

MOTIVATION: To provide high quality computationally tractable enzyme annotation in UniProtKB using Rhea, a comprehensive expert-curated knowledgebase of biochemical reactions which describes reaction participants using the ChEBI (Chemical Entities of Biological Interest) ontology. RESULTS: We replaced existing textual descriptions of biochemical reactions in UniProtKB with their equivalents from Rhea, which is now the standard for annotation of enzymatic reactions in UniProtKB. We developed improved search and query facilities for the UniProt website, REST API and SPARQL endpoint that leverage the chemical structure data, nomenclature and classification that Rhea and ChEBI provide. AVAILABILITY AND IMPLEMENTATION: UniProtKB at https://www.uniprot.org; UniProt REST API at https://www.uniprot.org/help/api; UniProt SPARQL endpoint at https://sparql.uniprot.org/; Rhea at https://www.rhea-db.org.

Subjects

Rheiformes; Animals; Databases, Protein; Knowledge Bases
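
A query against the SPARQL endpoint listed in this abstract could be prepared along the following lines. The sketch builds, but does not send, an HTTP GET request following the SPARQL 1.1 Protocol, with the query in a `query` parameter and JSON results requested through the `Accept` header. The `/sparql` path on the endpoint and the property path used in the example query are assumptions, and `build_sparql_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# The abstract gives https://sparql.uniprot.org/; the "/sparql" path is assumed.
SPARQL_ENDPOINT = "https://sparql.uniprot.org/sparql"

# Assumed vocabulary: proteins linked to Rhea reactions via catalytic activity.
EXAMPLE_QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein ?rhea WHERE {
  ?protein up:annotation/up:catalyticActivity/up:catalyzedReaction ?rhea .
} LIMIT 5
"""


def build_sparql_request(query: str, endpoint: str = SPARQL_ENDPOINT) -> Request:
    """Prepare a SPARQL Protocol GET request that asks for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    request = build_sparql_request(EXAMPLE_QUERY)
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the query. That step is left out so the sketch runs offline.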

FAIR adoption, assessment and challenges at UniProt.

Garcia, Leyla; Bolleman, Jerven; Gehant, Sebastien; Redaschi, Nicole; Martin, Maria.

Sci Data ; 6(1): 175, 2019 09 20.

Article in English | MEDLINE | ID: mdl-31541106

Genetic variations and diseases in UniProtKB/Swiss-Prot: the ins and outs of expert manual curation.

Famiglietti, Maria Livia; Estreicher, Anne; Gos, Arnaud; Bolleman, Jerven; Géhant, Sébastien; Breuza, Lionel; Bridge, Alan; Poux, Sylvain; Redaschi, Nicole; Bougueleret, Lydie; Xenarios, Ioannis.

Hum Mutat ; 35(8): 927-35, 2014 Aug.

Article in English | MEDLINE | ID: mdl-24848695

ABSTRACT

During the last few years, next-generation sequencing (NGS) technologies have accelerated the detection of genetic variants resulting in the rapid discovery of new disease-associated genes. However, the wealth of variation data made available by NGS alone is not sufficient to understand the mechanisms underlying disease pathogenesis and manifestation. Multidisciplinary approaches combining sequence and clinical data with prior biological knowledge are needed to unravel the role of genetic variants in human health and disease. In this context, it is crucial that these data are linked, organized, and made readily available through reliable online resources. The Swiss-Prot section of the Universal Protein Knowledgebase (UniProtKB/Swiss-Prot) provides the scientific community with a collection of information on protein functions, interactions, biological pathways, as well as human genetic diseases and variants, all manually reviewed by experts. In this article, we present an overview of the information content of UniProtKB/Swiss-Prot to show how this knowledgebase can support researchers in the elucidation of the mechanisms leading from a molecular defect to a disease phenotype.

Subjects

Databases, Protein/statistics & numerical data; Genetic Association Studies; Genetics, Medical; Knowledge Bases; Proteome; Software; Amino Acid Sequence; Genetic Variation; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Internet; Molecular Sequence Annotation; Molecular Sequence Data; Terminology as Topic

The EBI RDF platform: linked open data for the life sciences.

Jupp, Simon; Malone, James; Bolleman, Jerven; Brandizi, Marco; Davies, Mark; Garcia, Leyla; Gaulton, Anna; Gehant, Sebastien; Laibe, Camille; Redaschi, Nicole; Wimalaratne, Sarala M; Martin, Maria; Le Novère, Nicolas; Parkinson, Helen; Birney, Ewan; Jenkinson, Andrew M.

Bioinformatics ; 30(9): 1338-9, 2014 May 01.

Article in English | MEDLINE | ID: mdl-24413672

ABSTRACT

MOTIVATION: Resource description framework (RDF) is an emerging technology for describing, publishing and linking life science data. As a major provider of bioinformatics data and services, the European Bioinformatics Institute (EBI) is committed to making data readily accessible to the community in ways that meet existing demand. The EBI RDF platform has been developed to meet an increasing demand to coordinate RDF activities across the institute and provides a new entry point to querying and exploring integrated resources available at the EBI.

Subjects

Computational Biology/methods; Databases, Genetic; Academies and Institutes; Biomedical Research; Internet
